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I .  INTRODUCTION 


While  Grubbs  discusses  linear  statistical  regression  and  functional 
relations  in  a  general  way  in  BRL  Report  No.  1842,  Taylor  and  Moore 
explain  in  BRL  Report  No.  1986  the  use  of  Scheffe's  theorem*  in  providing 
simultaneous  confidence  bounds  for  a  polynomial  in  the  set-up  of  a 
general  linear  hypothesis  model  where  the  design  matrix  X  is  of  full 
rank.  The  aim  of  the  first  report  was,  as  it  appears,  to  provide  an 
introduction  to  the  subject  of  linear  regression  and  that  of  the  second, 
to  make  Scheffe's  theorem^  accessible  to  the  general  reader  in  the  set¬ 
up  of  a  nonsingular  design  matrix  X.  In  this  context,  one  may  also 


like  to  know  how  Scheffe's  theorem  could  be  used  when  the  design  matrix 
is  singular.  This  is  important,  since  most  of  the  Experimental  Design 
Models  are  characterized  by  singular  matrices.  In  the  perspective  of 
using  Scheffe's  theorem  when  the  design  matrix  X  is  singular,  it  may 
also  be  of  interest  to  know  how  the  basic  results  under  the  general 
linear  hypothesis  model  of  full  rank  would  extend  to  the  situation 
when  the  design  matrix  X  is  singular.  With  this  objective  in  mind,  some 
of  the  relevant  results  which  are  very  basic  in  the  theory  of  singular 
designs  are  brought  together  in  this  report  paving  the  background  with  an 
explanatory  introduction  to  the  solution  of  linear  equations  as  related 
to  the  concept  of  a  generalized  inverse  (g-inverse) . 


II.  THE  GENERAL  LINEAR  HYPOTHESIS  MODEL  OF  FULL  RANK 
A.  Characterization  of  the  Linear  Model 


Suppose  a  set  of  (p-1)  mathematical  variables,  x^,  x2,  ...,  xp 
are  related  to  a  random  variable  y  which  is  observed  with  an  unknown 
random  error  e  in  the  following  manner, 

y  =  30  +  BjXj  +  32x2  +  -  +  Vixp-i  +  £  C1 2 3) 


^Grubbs,  F.  E.,  "Linear  Statistical  Regression  and  Functional 
Relations,"  Ballistic  Research  Lab  Report  No.  1842,  AD  #A018651, 

1975. 

2 

Taylor,  M.S.,  and  Moore,  J.R.,  "Confidence  Bounds  for  the  General 
Linear  Model,"  Ballistic  Research  Lab  Report  No.  1986,  1977. (AD#A041035) 

3Scheffe,  H.,  "The  Analysis  of  Variance,"  John  Wiley  5  Sons,  Inc., 

New  York,  1959.  - 
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where  3 .  (j  =  0,1,2,...  p-1)  are  unknown  parameters  (constants).  Equa 

tion  (1) ,  which  is  linear  in  the  random  variables,  y  and  e, and  the 
parameters,  30,3r  TT^Tp.p  represents  a  linear  model,.  When  all 

the  observations  are  taken  in  accordance  with  this  model,  observing  a 
y  for  each  set  of  the  x's,  the  observational  equations  are  expressib 
in  matrix  notation  as 


y  =  X  3  +  e 


(2) 


With  n  observations,  y  is  an  nxl  column  vector  of  the  observed  y's, 

X  =  (x..),  (i  =  1,2,3“. ..n,  j  =  0,1,2, ...,p-l),  is  a  known  nxp  matrix 

with  x/!  as  its  (i,j)th  element;  3  is  a  pxl  vector  of  unknown  parameters, 

3 .  (j  =1f),  1, 2, . . .  ,p-l) ;  £  is  an  nxl  vector  of  unobserved  random  errors, 

(i  —  1 , 2 , 3 ,  •  • . , n)  • 

When  Rank  (X)  =  p,  we  say  that  the  model  is  of  full  rank.  In  the 
analysis  of  such  models  we  usually  consider  the  following  two  cases  co 
cerning  the  distribution  of  £. 

Case  1:  e  is  distributed  normally  with  mean  0^  and  covariance 

a2I,  a2  >  0,  where  I  is  the  nxn  identity  matrix.  (These  assumptions  are 
needed  for  tests  of  hypothesis  and  setting  confidence  bounds.) 


Case  2:  e  has  an  unknown  distribution  with  mean  0  and  covariance 
a2I.  (These  assumptions  are  referred  to  as  the  "Least  Squares”  assumptions.) 

B-  Some  of  the  Basic  Results  Relevant  to  the  Full  Rank  Model 

The  normal  equations  to  provide  the  least  squares  point  estimate 

A 

3  of  3  are  given  by 

(X'X)3  =  X'y  (3) 

=>  3  =  (X'X)  -1X'y  .  C4) 


The  equations  to  provide  the  maximum  likelihood  estimate  of^3  for  Case  1 

will  be  the  same.  The  least  squares  point  estimate  a  of  c  (the  same 
for  maximum  likelihood  estimate,  when  adjusted  for  bias)  is  given  by 

a2  =  (y-X3) '  (y-X3)  =  ^  [y'y-3'x’y]  •  & 

a.  Ao  9 

The  estimates  3  and  a  are  unbiased.  That  is  E(3)  =  c(a  )  =  o  , 
where  the  symbol  E  stands  for  mathematical  expectation.  The  least  squares 
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estimate  _£'3_  of  Ji'3_,  where  £_  is  any  pxl  vector  of  constants,  is  given  by 

£'3_  =  V  (X'X)"1  X'y  (6) 

A 

and  the  variance  of  £'3  by 


Var  (£'&)  =  02£' (X'X) ~1£ 


C7) 


In  particular,  £'  may  be  any  given  observation  vector  (1,  x. , 

A  1 


’  *»  Xp-P  * 

Under  the  assumptions  of  Case  1,  3_  has  the  multivariate  normal  distribu- 

.  .  2  -1 
tion  with  mean  3>  and  variance  a  S  ,  where  S  =  (X'X). 


C.  Estimation  of  a  Parametric  Function  When  the  Design  Matrix  is  Singular 

When  the  design  matrix  X  is  of  full  rank,  that  is,  when  Rank  (X)  =  p, 
we  can  provide,  as  noted  above,  an  estimate  for  each  element  of  3>  and 

A  - 

therefore,  also  an  estimate  £_'3_  for  £.'3_,  for  any  pxl  vector  £_.  However, 
when  the  design  matrix  X  is  not  of  full  rank,  that  is,  when  Rank  (X)  =  r, 
r<p,  implying  that  Rank  (X'X)  =  r,  we  cannot  provide  unique  solutions  to 
the  normal  equations  (3) .  Unique  solutions  given  in  (4)  require  a 
regular  inverse  of  (X'X).  Since  (X'X)  is  not  of  full  rank,  we  cannot 
compute  a  regular  inverse.  None-the-less,  the  normal  equations  (3) 
can  still  be  solved,  and  some  of  the  results  given  in  the  preceeding 
section  can  still  be  obtained  in  analogous  forms  by  using  what  is  called 
a  generalized  inverse  (g-inverse  or  a  pseudo-inverse)  of  the  matrix 
(X'X).  In  order  to  make  this  report  self-contained,  we  provide  in  the 
sequel  an  elementary  introduction  to  the  solution  of  numerical  equations 
as  related  to  the  concept  of  a  "generalized  inverse"  (g-inverse) . 

When  the  model  is  of  less  than  the  full  rank,  we  can  provide 
unbiased  estimates  of  some  specific  linear  functions  of  the  parameters. 

Such  functions  are  called  estimable  functions.  This  brings  in  the  defi¬ 
nition  of  estimability. 

A  linear  parametric  function,  \p  =  c'3  =  cn3.  +  c03„  +  ...  +  c  3  , 

XX  2  2  P  P 

is  said  to  be  estimable,  if  and  only  if  it  has  an  unbiased  linear  estimate 

A  A 

That  is,  there  should  exist  an  nxl  vector  a_,  such  that  E(ty)  = 

E(a'£)  =  ip,  which  in  turn  implies  that  a'X  3_  =  c_'3_,  for  all  3_.  Hence, 
should  be  of  the  form  a.'X  which  is  a  row  vector  in  the  row  space  of  X. 

D.  Scheffe's  Theorem 


We  note  here  again  that  all  linear  parametric  functions  are  uniquely 
estimable  in  the  full  rank  model.  In  such  a  situation,  we  can  think  of 
p  independent  linear  functions  of  the  parameters  given  as  ip^,  •••>  , 
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forming  what  may  be  called  a  basis  of  the  p-dimensional  space  L  of  the 
parametric  functions.  Thus,  in  the  full  rank  model  under  Case  1,  Scheffe's 
theorem  on  simultaneous  confidence  bounds  will  read  as  follows: 

Theorem  1:  The  probability  is  1-a  (where  a  is  the  size  of  the 
associated  test  of  hypothesis)  that  the  values  of  all  parametric  functions 
ip  e  L  simultaneously  satisfy  the  inequalities: 

}  -  So~  <  ip  <  }  +  ,  (8) 

2 

where  S  =  [pF  1 ,  F  denoting  Snedecor's  F  distribution  with  p 

^ )  P  )  /\  L  ^ 

and  (n-p)  degrees  of  freedom,  and  a*  is  the  estimated  standard  error  of  \p. 

Ir  one  is  interested  in  a  set  of  q  (q<p)  independent  parametric 
functions,  ip'  =  (rp^,  .  ..,  \p^) ,  S  will  change  to  [qFa.  n  ],  making 
the  confidence  interval  shorter. 

The  likelihood  ratio  test  which  provides  the  simultaneous  confidence 
bounds  (8)  for  provides  simultaneous  bounds  also  on  all  linear 

combinations  of  ^(i  =  1,  2,  . . . ,  q) ,  Vip  =  cf>,  where  l  is  any  qxl 

column  vector  of  constants  (see  [3,8]).  The  corresponding  theorem  on 
confidence  bounds  will  then  read  as  follows: 

Theorem  2:  The  probability  is  1-a  (where  a  is  the  size  of  the 
associated  test  of  hypothesis)  that  the  values  of  all  possible  linear 
combinations,  (p,  of  the  linear  parametric  functions  simultaneously 
satisfy  the  inequalities: 


where  S 


2 


<P  ~  So-  <  <p  <  rjj  +  So- 


[qp, 


a;  q,  n-p 


(9) 


In  order  to  apply  Scheffe's  theorem  in  the  context  of  a  singular 
model,  we  need  the  following  introduction  to  the  solution  of  a  system 
°f  linear  equations  as  related  to  the  concept  and  computation  of  the 
generalized  inverse  (g-inverse)  of  a  matrix. 
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III.  ON  THE  SOLUTION  OF  A  SYSTEM  OF  LINEAR  EQUATIONS 


A*  Necessary  Operations  Required  to  Solve  Linear  Equations  with  a 

Singular  Coefficient  Matrix. 

A  system  of  n  linear  equations  in  n  unknowns  may  be  written  as 


Ax  =  y_  (10) 

where  the  coefficient  matrix  A  is  an  nxn  matrix  of  known  constants,  x  is 
a  nxl  vector  of  the  unknown  variables,  and  y  is  a  vector  of  known  constants. 

When  Rank  (A)  =  n,  the  solution  of  the  equations  is  obtained  as  x  =  A-^y. 

When  Rank  (A)  =  p<n,  A  does  not  exist.  But  the  system  of  equations  may 
still  have  a  solution,  when  the  equations  are  consistent.  Solutions  for 
such  a  system  of  equations  exist,  when  and  only  when  Rank  (A)  =  Rank  (A:j0, 
which,  in  turn,  implies  that  £  lies  in  the  column  space  of  A.  This 
provides,  in  fact,  also  the  condition  for  consistency  of  the  equations. 

In  the  general  situation,  the  matrix  A  need  not  be  a  square  matrix. 

Ax  =  0^  of  equation  (10)  will  be  referred  to  as  the  homogeneous  part 
of  the  system.  As  we  know,  to  solve  a  system  of  equations,  any  one 
equation  can  be  multiplied  or  divided  by  a  constant  (other  than  0), 
and  that  any  two  equations  may  be  added,  or  one  equation  may  be  sub¬ 
tracted  from  another  without  affecting  the  solutions.  These  operations 
on  the  equations  may  be  performed  by  the  appropriate  operations  on  the 
rows  of  A  and  the  corresponding  elements  of  y. 

By^premultiplying  a  given  matrix  by  what  is  called  an  elementary 
matrix  E,  we  can  interchange  any  two  rows,  multiply  a  row  by  a  non-zero 
scalar,  or  replace  the  i  th  row  by  the  sum  of  the  i  th  row  and  c  times 
the  j  th  row.  These  elementary  matrices  are  obtained  from  the  identity 
matrices  of  appropriate  dimensions  after  performing  corresponding  opera¬ 
tions  on  the  identity  matrices.  Let  us  suppose  that  A  is  of  dimension 
3x3..  The  elementary  matrices  will  then  be  obtained  from  the  identity 
matrix  1^  of  dimension  3. 


The  elementary  matrix  E  to  interchange  the  first  and  second  row 
of  A  will  be  given  by 

r  — 

0  10 


E 


12 


10  0 


0  0  1 


E12  is  obtained  by  interchanging  the  first  and  the  second  row  of  I3> 
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The  elementary 
be  given  by 


matrix  to  multiply 
"  1  0 


the  second  row  of  A  by  3  will 
O' 


E2(3) 


0  3  0 

0  0  1 


E2(3)  is  obtained  by  multiplying  the  second  row  of  I3  by  3. 

The  elementary  matrix  to  replace  the  second  row  of  A  by  the  sum 
of  the  second  row  and  (-3)  times  the  third  row  will  be  giver,  by 


E23C-3)  = 


10  0 
0  1-3 

0  0  1 


E0,(-3)  is  obtained  by  replacing  the  second  row  of  I3  by  the  sum  of  the 
second  row  and  (-3)  times  the  third  row  of  Ij. 


If  A  is  a 

matrix  I  by  a 
P 


pxp  nonsingular  matrix,  we  can  reduce  A  to  the  identity 
finite  number  (say,  t)  of  row  operations  on  A.  That  is. 


Et  Et-1 


Et  Et-1 


Xt  is  then  observed  that  the  product  of  the  elementary  matrices  gives 
the  inverse  of  A,  when  A  is  nonsingular. 

It  may  be  pointed  out  that  post-multiplication  by  the  elementary 
matrices  E  obtained  from  the  column  operations  on  the  identity  matrix 
provides  the  corresponding  column  operations. 

g  A  Numerical  Illustration  of  Solving  a  System  of  Linear  Equations 

by  the  Sweep-out  Method. 

Given  below  is  a  system  of  3  linear  equations  in  4  variables. 
Here,  the  associated  coefficient  matrix  A  has  3  rows  and  4  columns. 
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(1).  Equations:  x ^  +  2x 2  +  3x3  +  x4  =  4 

4x,  +  5x0  +  6x_  +  2x.  =  5 

12  3  4 

8x,  +  13x0  +  18x_  +  6x.  =  21 

12  3  4 

Homogeneous  part  Non-homogeneous  part 


Xj  +  2x2  +  3x3  +  x4  =  0  ....  =4 

4Xj  +  5x2  +  6x3  +  2x4  =  0  ....  =5 

8Xj  +  13x2  +  18x3  +  6x4  =  0  ....  =21 

E32(-l),  E31(-4) 

12  3  1  ....  =4 


12  3  1  ....  =4 

4  5  6  2  ....  =5 

0  0  0  0  ....  =  0 


E2i(-4) 

1 

2 

3 

1 

•  •  •  • 

ss 

4 

0 

-3 

-6 

-2 

•  •  •  • 

= 

-11 

0 

0 

0 

0 

•  •  •  • 

= 

0 

e2C-D 

1 

2 

3 

1 

•  •  •  • 

= 

4 

0 

3 

6 

2 

•  •  •  • 

= 

11 

0 

0 

0 

0 

•  •  •  • 

s 

0 

E2(l/3) 

1 

2 

3 

1 

•  •  •  • 

= 

4 

0 

1 

2 

2/3 

•  •  •  # 

= 

11/3 

0 

0 

0 

0 

•  •  •  • 

= 

0 

ei2(-2) 

1 

0 

-1 

-1/3 

•  •  •  • 

= 

-10/3 

0 

1 

2 

2/3 

•  •  •  • 

= 

11/3 

0 

0 

0 

0 

•  •  •  • 

= 

0 
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The  method  of  elimination  adopted  above  is  sometimes  referred  to 
in  literature  as  the  method  of  "sweep-out."  The  method  of  "sweep-out" 
gives  us  a  basis  of  the  row  space  of  A  along  with  the  solutions,  if 
solutions  exist.  Solutions  will  exist,  if  when  a  row  is  swept  out 
leaving  0's  as  its  elements,  the  corresponding  element  of  y_  should  also 
go  to  0  in  the  "sweep-out"  procedure.  The  "sweep-out"  procedure, there¬ 
fore,  gives  us  also  a  way  of  finding  if  the  equations  are  consistent. 

The  third  row  has  been  swept  out,  retaining  only  two  equations  in 
four  variables.  The  first  row  could  have  been  swept  out,  and  the  last 
two  rows  raised  above,  retaining  the  same  form  of  the  reduced  coefficient 
matrix.  The  form  is  important.  It  may  be  pointed  out  that  other  kinds 
of  row  operations  could  also  have  been  performed,  retaining  the  same 
form  of  the  reduced  coefficient  matrix.  Now,  merely  from  an  inspection 
of  the  non-zero  part  of  the  reduced  coefficnent  matrix,  it  is  possible 
to  find  a  solution  vector  of  the  homogeneous  part,  a  vector  that  is 
orthogonal  to  a  row  of  the  reduced  matrix.  Such  solutions  are  indicated 
below. 

eneous  Part 


Solutions  shown  as  columns 
j  -1  -1/3 

i  2  2/3 

-1  0 

0  -1 


the  Equations  Ax  =  y 


(2). 


Solutions  of  the  Homo 


Basis 


Solns . 
shown  as  < 
rows  =>  j 

t. 


1 

0 

-  1 


L 


3 


0 

1 


2  - 

2 

3 


-1/3 

2/3 


0 

-1 


(3) .  A  Particular  Solution  of 


This  comes  from  the  nonhomogeneous  part,  and  is  obtained  by  adding  two 
zeros  to  the  reduced  part  of  the  vector  y.  If  there  are  four  equations 
in  two  variables,  we  can  assign  arbitrary  values  to  two  of  the  variables, 
and  solve  uniquely  for  the  remaining  two  variables.  In  this  case,  zero 
values  are  assigned  to  the  third  and  the  fourth  variables. 
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■  - 


(4). 

Particular  Solution 


General  Solution 

+  General  Solution  from  the  homogeneous  part 


X1 = 

-10/3 

/-1N- 

/ -1/3  \ 

X2  = 

11/3 

'  2 

2/3  '  j 

+ 

X 

+  P 

\ 

X3  = 

| 

0 

! 

-1 

0  /' 

1 

*4  * 

0 

L_  J 

\  0  1 

1 

-1  h 

X,  p  being  arbitrary  scalars. 

includLS^bV°i  be  di££ic?lt  t0  verifX  how  all  possible  solutions  are 
values  of  ?  u  general  solution .  Suppose  we  want  to  find  the 

values  of  and  x2  by  assigning  Cj  to  x3,  c2  to  instead  of  0's. 

This  solution  will  then  be  given  by  writing  X  =  -Cj,  and  p  =  -c^ 

enuivalmt^f  St6?  tSken  in  the  eliminati°n  process  shown  above  is 
Jib l  an  elementary  row  operation  provided  by  premultiplication 
with  elementary  matrices,  E..(c),  E.(k),  etc.  These  elementary  matrices 

are  indicated  prior  to  the  steps  taken. 

square  fi  ^  4x4?SSaryr  *  T°W  °f.°'s  may  be  added  to  make  the  matrix  A 
i  *  Correspondingly,  the  vector  y  may  be  made  to 
consist  of  4  elements,  with  the  fourth  element  as  zero. 

to  reduce  T »  ««*«"  “¥>*»« 


B  -  E12(‘2>  E2 

Cl/3)  e2(-i) 

E21 

-5/3 

2/3  0 

0 

4/3 

-1/3  0 

0 

-4 

-1  1 

o 

0 

0  0 

1  J 

is  of  dimension 

4x4. 
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One  may  also  introduce  "economy"  in  the  number  of  row  operations 
or  perform  other  row  operations  and  get  a  different  form  for  B.  B  has 
been  obtained  the  same  way  as  a  regular  inverse  of  A  is  sometimes  obtained, 

when  A  is  nonsingular.  B  =  A  ^  is  unique,  when  A  is  nonsingular.  But, 

B  is  not  unique,  when  A  is  singular.  Other  forms  of  B  could  be  found4 
reducing  A  to  the  above  standard  form. 


It  may  be  verified  that  the  particular  solution,  referred  to 
above,  is  given  by  B>^.  In  this  case,  the  last  column  of  B  is  redundant, 
as  it  does  not  contribute  to  the  solution.  The  last  column  of  B  has 
a  unity  in  the  fourth  row,  while  the  rest  of  the  elements  are  0's,  and 
y  has  a  0  in  the  fourth  row.  Omitting  the  fourth  column  of  B,  the  re¬ 
maining  4x3  matrix  can  be  taken  as  a  g-inverse  of  A.  Calling  it  B, 
a  solution  for  is  given  by  x  =  By. 


Although  B  is  not  unique,  the  particular  solution 
character.  It  should  be  evident  that  the  values  of  x^ 


x 


1 

4 


By_  has  a  unique 
and  x 0  come  from 


The  values  of  x^  and  X2  are  unique.  If,  for  example,  the  first  row  were 

swept  out,  retaining  the  second  and  the  third  rows,  the  values  of  x.  and 
X2  would  have  come  from  * 

-1 


1—  S 

X1 

‘4 

5" 

s" 

_X2_ 

8 

13 

21 

The  values  of  Xj  and  x2  are  the  same  as  above.  In  both  cases,  x3  and 
x4  are  assigned  zero  values.  (This  aspect  of  uniqueness  of  By  is 
referred  to  later  in  this  report.) 


C .  The  General  Solution  in  a  Compact  Form 

The  general  solution  to  the  equations  Ax  =  y  can  be  written  in  a 
compact  form  as 


*  =  By  +  (H-I)z,  (11) 

where  H  =  BA,  and  z_  is  any  arbitrary  vector.  The  first  part  of  the 

4 

Banerjee,  K.S.,  "Singularity  in  Hotelling's  Weighing  Designs  and 
a  Generalized  Inverse,"  Annals  of  Math.  Stat,  37,  4,  1021-31 
(1966).  (Correction  note:  Annals  of  Math,  Stat.,  40,  2,  719.) 
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solution,  B^,  is  what  has  been  termed  as  the  particular  solution,  while 
the  second  component  is  derived  from  the  homogeneous  part.  As  z  is 
arbitrary,  the  above  solutions  may  also  be  expressed  as  B^_  +  (I~H)_z.  We 
may  adopt  any  one  of  these  two  forms.  We  provide  below  a  derivation  of 
the  compact  form  of  the  general  solution  by  way  of  summarizing  what  has 
been  done  under  the  "sweep-out"  operations. 


D.  The  Derivation  of  the  General  Solution 


The  second  component  of  the  general  solution  comes  from  the  homogeneous 
part.  We  recall  that  a  matrix  B  exists  such  that  BA  =  H  where  H  is  of  a 
particular  form.  That  is. 

Ax  =  0  **■  Hx  =  0 


21  =  0.  The  solutions  are  given  by 


0  I.  0 


(12) 


Any  column  in  the  column  space  of  (12)  is  also  a  solution.  That  is. 


H12  ] 


-I 

n-P 


z 

— n-p 


is  also  a  solution,  where  is  a  vector  of  (n-p)  elements.  Also, 


— 

°pxp  |  H12 

r 

L _ ! 

_P(n-p)xpj  -Inp 

Z 

(H-I)z 


is  a  solution,  where  z  is  arbitrary.  Introduction  of  z  adds  0  to  the 

~~P 

solution.  It  may  be  noted  that  this  whole  part  merely  adds  0  to  the 
R.H.S.  of  the  equations,  Ax  =  y. 

The  first  component.  By,  of  the  general  solution,  called  a 
particular  solution,  comes  as  follows: 


15 


(13) 


Ax  =  y_ 


I 

p 

!  v 

V  — 

el  = 

is/ 

0 

1  0 
i  _ 

A  — 

_^-P. 

fl 

-_p_ 

i  u  -I 

_j__12 

X 

-IS.. 

i 

i 

0 

'  0 

1  _ 

X 

-n-p 

where  the  suffixes  attached  to  the  column  vectors  indicate  their  dimen¬ 
sions.  (13)  implies  that 


x 

-P 


H.  _x 
12-n- 


P 


Since  we  can  assign  arbitrary  values  to  the  elements  of  the  vector  x. 
can  set  it  to  ^  to  get  a  solution.  This  solution  has  been  termed 
the  particular  solution,  being  given  by 


r  , 

L  1 

is/ 

X 

0 

_ -n-p 

_-n-p- 

Combining  the  two  together,  we  get  the  general  solution  in  the  above 

form. 


IV.  GENERALIZED  INVERSES 
A.  Introductory  Remarks 

It  has  been  observed  in  the  preceding  sections  that  even  when  the 
rank  of  the  coefficient  matrix  A  is  less  than  the  full,  we  can  find 
solutions  to  the  system  of  equations,  Ax_  =  y,  when  the  equations  are 
consistent.  It  has  also  been  observed  that- By  gives  a  solution,  la 
particular  one),  which  is  a  component  part  of  the  general  solution. 
Thus,  B  takes  the  place  of  the  inverse  of  A,  and  may,  therefore,  be 
taken  as  a  pseudo-inverse  or,  more  generally,  as  a  generalized  inverse 
(g-inverse)  of  A.  Some  authors  designate  an  inverse  such  as  B  as  a  con 
ditional  inverse.  There  are,  in  fact,  many  other  pseudo- inverses,  or 
generalized  inverses  (g-inverses)  depending  on  the  properties  these 
inverses  satisfy.  All  of  these  generalized  inverses  are  not  unique  as 
we  have  shown  for  B.  Only  one  of  these  inverses,  the  one  due  to 
Moore-Penrose,  is  unique.  For  this  unique  g-inverse,  we  shall  reserve 
the  symbol  "t",  while  for  other  g-inverses,  we  shall  use 


we 
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The  following  is  an  introduction  to  g-inverses.  The  material  for 

the  first  part  of  this  discussion  is  drawn  from  Rao5,  while  that  for  the 

second  part  from  Greville.^  One  may  refer  to  these  two  references  and 
7.8  4 

also  to  Price  ,  Graybill  and  Banerjee  for  details  and  further  insight. 
B.  AG -inverse  That  is  Not  Unique.  (See  C.  R.  Rao5) 

Definition  of  g-inverse 

A  generalized  inverse  (or  g-inverse)  of  a  matrix  A  of  order  mxn  is 
a  matrix  of  order  nxm,  denoted  A-,  such  that  for  any  y_  for  which 
Ax  =  y_  is  consistent,  x^  =  A~y  is  a  solution. 

Lemma  1.  If  A  is  a  g-inverse,  then  A  A~A  =  A,  and  conversely, 
th 

Choose  y_  as  the  i  column  a_^  of  A.  Then,  the  equation  Ax  =  a^  is 
consistent,  as  ju  lies  in  the  column  space  of  A.  Hence,  x  =  A'a^  is  a 
solution.  That  is,  AA~£i  =  8^,  for  all  columns  ju  of  A.  This  implies 
that  A  A  A  =  A.  Conversely,  if  A”  exists  such  that  A  A"  A  =  A,  and 
Ax  =  y  is  consistent,  then  A  A~  Ax  =  Ax  =  £,  or  A  A~  y  =  y.  Hence, 

A'y  is  a  solution  for  Ax  =  y_.  Thus,  A“  is,  by  definition,  a  g-inverse. 

Lemma  2.  Let  A  A  =  H  for  a  given  g-inverse  A~.  Then 


(a) 

H^  =  H;  i . e. ,  H  is 

idempotent; 

(b). 

AH  =  A. 

Proof 

of 

(a): 

H2  =  A'  A  A"  A  =  A' 

'  A  =  H 

Proof 

of 

(b): 

AH  =  A  A"  A  =  A 

^Rao,  C.R.,  "A  Note  on  a  Generalized  Inverse  of  a  Matrix  with  Appli¬ 
cations  to  Problems  in  Mathematical  Statistics,”  J.  Roy.  Statist. 
Soc.,  B,  24,  152-158,  1962.  - - 

Greville,  T.N.E.,  "The  Pseudo  Inverse  of  a  Rectangular  or  Singular 
Matrix  and  its  Application  to  the  Solution  of  Systems  of  Linear 
Equations,"  SIAM  Review,  Vol.  1,  No.  1,  pp.  38-43,  Jan.  1959. 

7 

Price,  C.  M.,  "The  Matrix  Pseudo  Inverse  and  Minimal  Variance 
■  Estimates,"  SIAM  Review,  Vol.  6,  No.  2.  115-20.  1964. 

g 

Graybill,  F.  A.,  "Theory  and  Application  of  the  Linear  Model," 

Dux bury  Press,  Massacuusetts,  1976. 
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L_emma  3.  A  g-inverse  exists  for  any  matrix  A,  although  it  may  not 
be  unique,  and  it  can  be  constructed  in  such  a  way  that  it  has  A 
itself  as  a  g-inverse.  In  other_words,  it  is  possible  to  find  A~  such 
that  A  A  A  =  A,  and  that  A  A  A-  =  A-. 

Given  a  matrix  A  of  order  mxn  and  rank  s,  there  exist  non-singular 
matrices  P  and  Q  of  orders  m  and  n  respectively,  such  that  PAQ  =  A,  or 

A  =  P'1AQ"1,  where 


and  Dc  is  a  diagonal 
A  =  QA  P,  where 


A  = 


matrix  of 


fD  1 
s  | 

I - 

O  i 

1 

1 

1 _ 

1 

1 

1  o 
_ 1 

1 

1 

O  1 
1 _ ! 

order  s  and  rank 


d~a  1 
s  | 

i - 

O  1 

t - 

O  1 

1 

1 

i 

O  1 
i _ ! 

s. 


This  A  satisfies 


Let  us  define 


(a) .  A  A”  A  =  A 
fb) .  A"  A  A"  =  A'. 

Remark:  If,  in  particular,  A  is  a  symmetric  matrix,  then  Q  =  P*.  It 
may  also  be  pointed  out  the  matrix  P  or  Q  can  be  obtained  as  a  product 
o±  the  elementary  matrices  introduced  earlier. 

Observation:  In  order  that  that  A-  be  unique,  A~  has  to  satisfy,  it  is 
known  5  ,  the  following  two  additional  relationships: 


(c)  .  (A  A“)  '  =  A  A~ 

(d)  .  (A" A)  =  A"  A. 

C.  The  Unique  G-inverse  (See  T.  N.  E.  Greville6) 

For  any  matrix  A,  there  exists,  as  referred  to  above,  a  unique 
g-inverse  A+  such  that 
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1 .  A  A+  A  =  A 

t  t  t 

2.  A  A  At  =  AT 

t  1  + 

3.  (A  A1)  =  A  AT 

4.  (A+  A)  '  =  A+  A  . 


D.  Unique  G-inverse  for  Special  Rectangular  Matrices  B  and  C 

1.  If  any  matrix  B  is  of  dimensions  nxm,  (m  <  n) ,  and  of  rank  m, 

f 

then  B  is  obtained  as 


t  i  _i  i 

Bt  =  (B  B)  B  . 


(14) 


2.  If  any  matrix  C  is  of  dimensions  mxn  (m  <  n) ,  and  of  rank  m, 

+ 

then  C  is  obtained  as 


t  »  »  _i 

C  =  C  (CC  ) 


(15) 


E.  Left  and  Right  Identity  Matrices  for  any  General  Matrix  A. 

Let  Rank  (A)  =  r  <  m,  and  let  B  denote  a  matrix  of  r  columns  whose 

columns  form  a  basis  for  the  column  space  of  A.  [In  particular,  B  might 
be  formed  by  selecting  r  linearly  independent  columns  of  A.)  Also,  let 
C  be  an  r-rowed  matrix  whose  rows  form  a  basis  for  the  row  space  of  A. 

[C  may  be  formed  by  selecting  r  linearly  independent  rows  of  A.)  The 
g-inverses  of  B  and  C  are  given  by  (14)  and  (15)  above.  Then  A  has 
a  unique  left  identity  1^  and  a  unique  right  identity  1^  being  given  by 

■l  ■  BB+ 

*R  *  C+C- 

such  that  I^A  =  A,  and  AI^  =  A.  The  proof  follows  from  the  fact  that 

each  column  of  A  is  a  linear  combination  of  the  columns  of  B,  and  that 

each  row  of  A  is  a  linear  combination  of  the  rows  of  C. 

F  „  Existence  of  a  Unique  G-inverse  of  a  General  Matrix 

1.  An  Observation:  For  any  matrix  A,  there  is  a  unique  matrix 
A  ,  which  has  its  rows  and  columns  in  the  row  space  and  column  space  of 
A  ,  and  also  satisfies  the  equations, 

=  :T  ,  and  A+A  =  I 

L  R- 
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4*  4- 

It  can  be  verified  that  for  B  and  C,  the  matrices  B  and  C  satisfy  the 
above  requirements. 

2.  A  in  General.  To  get  A  in  general,  we  introduce  the  matrix 
G  being  given  by 

G  =  B+  A  C+  . 

From  the  above, 

B  G  C  =  A  . 

4» 

It  may  be  noted  that  G  is  of  rank  r.  We  finally  define  A  as 


t  t  -1  t 
A1  =  CtG  B r 


(See  Greville  for  uniqueness  of  AT.) 


V.  LINEAR  HYPOTHESIS  MODEL  OF  LESS  THAN  THE  FULL  RANK 
A.  The  Problem  of  Estimation 


The  general  linear  hypothesis  model  introduced  in  equation  (2)  is 
of  full  rank,  where  Rank  (X)  =  p.  The  model  will  be  said  to  be  of  less 
than  the  full  rank,  when  Rank  (X)  =  r<p.  Most  of  the  problems  in 
Design  of  Experiments  are  characterized  by  models  of  less  than  the  full 
rank.  Both  under  Case  l(permitting  the  maximum  likelihood  estimation 
procedure)  and  under  Case  2( giving  the  least  squares  estimates)  the  normal 
equations  are  obtained  as 

(X'X)i  =  X’y  .  (16) 

Since  Rank  (X)  =  r,  r<p.  Rank  of  [XfX]  is  also  r.  Hence  (X!X)  ^  does 
not  exist,  as  X*X  is  of  dimension  pxp.  However,  we  can  still  solve  the 
equations  using  the  first  g-inverse  of  XfX  =  S  introduced  earlier,  and 
express  the  general  solution  for  g  as 

£  =  S'X'y  +  (I-H)£  (17) 

where  H  =  S  S. 

-  2  2 

The  least  squares  estimate  o  of  a  ,  and  the  maximum  likelihood 
"2  2 

estimate  o  of  o  (adjusted  for  the  bias)  are  the  same,  being  given  by 

°2  =  [(y-xjL)  =  hit  08) 

where  ^  represents  any  solution  of  equation  (16)  given  above  in  (17). 
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Since  Rank  (S)  =  r,  it  is  not  possible  to  provide  a  unique  esti¬ 
mate  for  each  element  of  3_.  However,  it  is  possible  to  provide  a  best 
(in  the  sense  of  minimum  variance)  and  unbiased  estimate  of  an  estimable 
linear  function  of  the  elements  of  3_.  It  should  be  pointed  out  that  all 
possible  linear  functions  of  the  elements  of  3_ are  not  estimable.  If  it 
were  so,  every  element  of  3_  would  have  been  estimable.  There  are  many 
equivalent,  necessary  and  sufficient,  condition^  that  would  make  a  linear 
function  of  the  parameters  such  as  IB  (where  X  is  a  row  vector  of 
constants)  estimable.  We  provide  below  a  few  of  such  necessary  and  suffi¬ 
cient  conditions.  For  further  details,  one  may  refer  to  Graybill8. 

A  linear  combination  of  the  parameters,  X_  3_,  is  estimable,  if  and 
only  if: 

I 

1  .  X_  is  a  row  vector  of  X,  or  a  linear  combination  of  the  row 

t 

vectors  of  X.  In  other  words,  X_  is  in  the  row  space  of  X. 

2  .  The  equations  Sr  =  X_  are  consistent,  where  S  =  X'X.  That  is, 

a  solution  r  exists  for  the  equations. 

1  .  I  I 

3.X  is  of  the  form  X  =  X  H,  where  H  =  S'S,  and  S'  is  the  first 
g-inverse  of  S  =  X'X,  that  is,  the  g-inverse  that  satisfies  only 
the  condition  SS'S  =  S. 

t  A 

B.  On  the  Estimate,  X  3 


Although  the  normal  equations  (16)  will  have  infinitely  many  solu- 

•  ^  f  A 

tions  for  3.  implying  that  3_  is  not  unique,  the  estimate  X  3  of  the 

•  I  |  A 

estimable  function  X_  3_  is  unique.  Also,  X_  3_  is  unbiased,  which  can  be 
shown  using  the  first  g-inverse  which  is  more  general  than  the  unique 
g-inverse. 

1.  Unbiased:  E[x'eJ  =  E[X_' ( S~X£  +  (I-H)z)] 

=  E[\S"x\  +  (x'-x'h)  z] 

I  _  I  ff 

=  E[X_  S  X  yj  +0,  since  X^  =  X  H 
=  ^'s'x'x3_  =  X_'s'S3_ 

=  XH  3_ 

=  X *  *  3  . 
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Uni-que:  Let  Rank  (X)  =  r.  Let  it  be  possible  to  rearrange  the 
columns  o±‘  X  with  a  corresponding  rearrangement  of  the  elements  of  6 
in  such  a  manner  that  the  first  r  columns  of  X,  denoted  X,,  be  independent. 
Partitioning  X  as  [X^:X2],  we  have 


[x'x]£  = 

xjxi  1  xjx2 

S11  |_S12 

ir__’ 

K1 

!xi>i 

x'xi  !  xjx2 

£  =  S£  = 

S21  |  S22 

V"r 

=  7  y  = 

!  X2 

J  1 

~T“~ 

x2£ 

where  Rank  (S^)  =  r,  denotes  the  first  r  elements  of  g_,  and  X^y, 

the  first  r  elements  of  X  y_.  Application  of  the  "sweep-out"  procedure 
would  reduce  the  above  equation  to 


Applying  the  sweep-out  procedure  still  further,  we  should  have 


Fr  !  SU  S121 

<4-  ; 

i—..1 

Sii(xi^)! 

1 

;  o  *  o 

i  6 

L^"rJ 

i 

0 

That  is. 


ri  ] 

’sii(xkr 

o ;  o  ,  g 

1  _  irp-rj  | 

o 

r  lr 


L-^p-r  _ 


sn(xiL) 


(19) 


Since  S^^X^y)  is  unique,  the  particular  solution  given  in  (19) 
is  unique. 
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t  *■ 

Hence,  the  estimate  X  3  reduces  to 


=  xVx'y  +  X_'(I-H)  z  =  x'sVy 


-  !  f  _  I  t 

where  S  X  y  is  the  particular  solution.  The  above  reduces  to  X_  Sn  (X.y), 

f  f  f  "1  f  1 1  1 

where  X.r  represents  the  first  r  elements  of  .  X^S^fXjy)  is  unique. 

t  * 

C.  Variance  of  the  Estimate  X  3 


V(X_'£)  =  V(x's"x'y) 

=  (X_  S  X  X  S  X)cr2,  since  (S)  =  (S  )~  =  S~ 

=  (x's'S  S"x)o2  =  [xV(xVs)']a2 

=  [x's“(x'H)']a2  =  (x's"x)a2  .  (20) 

We  have  seen  above  that  the  estimate  X*3  reduces  to  (x's"1  x'  y) . 

i a  — r  ll  l  — 

Hence,  the  variance  of  X  3  will  reduce  to 


_  *  _1  »  f  _  I  ft  O 

[<!*  s„  x,)  t»r  Sj{  xp  ]</ 

-  <4  sn  xi  X1  su  V°2 
1  -1  2 

=  CXr  X_r)o  which  is  unique. 


D.  On  the  Estimate  a  Which  is  Unique  and  Unbiased 


While  the  normal  equations  (16)  will  provide  infinitely  many 

*  -2  A 

solutions  for  the  estimate  a  is  unique,  although  it  contains  3. 

A2  . 

a  is  given  by 

*  2  i  t  *  t  t 

°  =  i.  ~  i  x  zJ  •  un 

A  1  f  !  A  f 

In  equation  (21),  3  X  y  =  (y  X) 3  is  unique,  since  (y  X) 3  is  of  the 

f  A  |  *-  - 

form  X  _3,  where  X  is  a  combination  of  the  row  vectors  of  X  and  thus 
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1  I A  •  I  _  I 

A_  g_  is  estimable  with  a  unique  estimate  £  =  (y  X)  (X  X)  X  y.  Hence 
(21)  reduces  to 


=  C 


1 

n-r 


-  ( 


1 

n-r 


=  c 


1 

n-r 


(sV  z’n  - 

) [xp  +  ej  [I  -  xs'x'] 

)  [3_* X *  (I  -  XS~x')X£  + 

)  [B.  (x'x  -  x'xs'x'x)B. 


xs"x']  y_ 

[XB  +  ej 

2e'(I  -  XS'x')Xf3_  +  e'(I  -  XS“x')e] 

+  2e'(I  -  XS~x')X£  +  e'(I  -  XS~x')~e] 


The  first  component  of  the  above  expression  is  0.  If  we  take  expec¬ 
tation  of  the  remaining  terms,  the  second  term  will  be  0.  The  expec¬ 
tation  of  the  third  term  which  is  a  quadratic  form  in  e  with  mean 

2  — 

0.  and  variance  a  I  will  be  equal  to  (by  a  well  known  theorem,  see  [8]). 

a2  Trace  [I  -  X  S"x'] 

=  a2  (n  -  Trace  X  S~x') 

2  -  2 
=0  (n  -  Trace  S  S  )  =  a  (n  -  Trace  H) 

=  a2  (n-r)  . 

^2 

Hence,  the  estimate  a  is  unbiased. 


E.  Scheffe!s  Theorem 

Sche£fe*s  theorem  as  given  in  (8)  will  reduce,  when  Rank  (X)  =  r,  to 


^'•SCTi<^<^  +  sCTi  >  (22) 


where  S 


[qF 


a;  q,  n-r 


],  q  <  r.  The  change  of  (n-p)  to  (n-r)  should 


be  noted.  It  should  also  be  noted  that  \p  is  now  of  the  form  X  (3*  where 

II  - 

X  =  X  H„  In  the  light  of  what  has  been  provided  in  this  section, 

2  ,,  '„-l,  *2 


a 


(X  S.7X  )a 
— r  11— r 


In  Design  of  Experiments,  the  function  ip  =  X  (3  is  often  required 
to  be  of  the  form  of  a  contrast,  and  we  may  be  interested  in  all 
possible  contrasts  which  are  mutually  orthogonal. 
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A  contrast  among  the  parameters  is  defined  to  be  a  linear  function 

of  the  parameters,  \  c.g.,  such  that  ?  c.  =  0. 

i=l  11  i=l  1 


Two  contrasts  i|>.  =  £  c.g.,  ty  =  )  d.g.  are  said  to  be  ortho- 

i=l  1  1  z  i=l  1  1 

gonal,  if  and  only  if  ^  c.d.  =  0. 

i=l  1  1 

If  we  are  interested  in  all  possible  contrasts  which  are  mutually 
orthogonal,  q  will  change  to  (r-1),  because  we  can  only  think  of  (r-1) 
mutually  othogonal  contrasts  from  a  space  of  rank  r. 

If  a  set  of  q  linear  functions  Ag,  where  A  is  of  dimension  qxp, 

are  individually  estimable,  their  linear  compound  <p  =  Z_  Ag.,  where  Z 
is  a  qxl  column  vector  of  constants,  will  also  be  estimable.  The  ~ 

variance  of  <f>  will  be  given  by 

a|  =  a2 [ C a'z) '  S'  (a'z)].  (23) 

Hence,  we  may  also  have  an  analog  of  the  formula  given  in  (9) . 

It  may  be  noted  that  if  one  is  interested  in  contrasts  of  the 

type  (g^  -  g^,),  then  one  may  use  the  confidence  intervals  given  by 

Tukey  (see  Scheffe  and  Graybill^),  as  such  intervals  would  give  shorter 
lengths. 
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F.  An  Illustration 


We  provide  below  a  simple  example  illustrating  the  use  of  Scheffes 
theorem  as  given  in  (22) .  The  example  is  drawn  from  an  experiment  in  which 
it  was  desired  to  find  the  contributions  of  three  factors  represented  by 
3j,  3 2  and  3j‘  Y  =  X3  is  obtained  as 

X3  =  Y 


Measurements 


i  i  i 

■  n 

ei 

7.8 

l  l  l 

8.0  i 

1-10 

0.8 

1-10 

63  . 

1 

0.4 

The  4x3  design  matrix  X  is  of  rank  2.  Hence,  n=4,  r=2,  p=3. 

*  i 

For  the  normal  equations  S3  =  X  Y,  we  have 


r4 

0 

"  1 

2 

31 

17.0 

1 

!o 

4 

i 

i 

2 

A 

^2  :  = 

1 

14.6 

I2 

2 

2 

A 

L  ^3  - 

1 

j  15.8 

L  _ 

The  first  g-inverse  of  S 

is  obtained  as 

r  i  „  h 

,r  i 

4  0  0 

1  0  J 

S"  =  B  = 

0  i  0 

and  H  =  S"S  = 

o  i  i 

1  1 

2  ‘  2  1 

y 

0  0  0 

The  particular  solution  is  obtained  as 


3  =  b(x'y) 


4.25 

3.65 

0 
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An  estimable  parametric  function  will  obviously  be  given  by  X*  =  [1  o  |], 
as  it  satisfied  X  =  X  H. 


t  A 

Hence,  X  3  = 


4.25 

3.65 

0 


4.25  =  \p  . 


.  ^2 

Again,  a  is  estimated  by 


1  t  A|  t 

HIF  Y  -  3  X  Y] 


=  j  [125.64  -  125.54] 
=  .05  . 


Taking  q  =  2  and  a  =  .05,  we  have  qFa;  q>n_r  =  qFa;2j2  =  S2  =  2  x  19.00 

=  38.00,  S  =  6.16,  and  Var  (40  =  Var  (A*  3)  =  a2X^.  S"jxr  =  a2/ 4. 

Substituting  .05  for  a2,  we  have,  for  Var  (A* 3)  =  .0125.  Hence  the  95 
percent  confidence  bounds  are  given  by  [4.25  ±  6.16  (.11)]  -*  (3.57,  4.93). 

In  the  above,  X  3  is  of  the  form  3^  +  3^/2.  Depending  on  the 
necessity,  we  could  also  work  with  the  estimable  functions,  such  as 
32  -  3j_,  3j  +  32  +  33,  etc. 
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